The Fulmine Delta Protocol

Fulmine is a distributed data-model system that uses record entities as the building blocks for the data-model. The records are transmitted between Fulmine instances using a custom protocol.

Overview

The protocol transmits record entities as frames. A frame is a variable-length sequence of bytes. Record entities are flat data structures composed of fields with data. Each field in a record is assigned an integer field code. The protocol uses these field codes, rather than field names, when constructing a frame, which gives a more compact representation of the record.

The data for each field is further compacted, for integrals only: only the bytes required to reconstruct the value are sent. For example, the integral 120 is represented as 4 bytes in memory but needs only 1 byte for transmission.

A record entity indicates which fields are ‘dirty’ (i.e. should be transmitted because they have changed since the last transmission). Only ‘dirty’ fields are transmitted, so only a delta of the record is sent, which reduces the frame size.
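
The dirty-field idea can be illustrated with a minimal sketch. The `Record` class, its method names and the example field codes are hypothetical and are not part of Fulmine; the sketch only shows how tracking changed fields yields a delta.

```python
class Record:
    """Hypothetical record with dirty-field tracking (illustrative names only)."""

    def __init__(self, name):
        self.name = name
        self._values = {}    # field code -> current value
        self._dirty = set()  # field codes changed since the last transmission

    def set(self, field_code, value):
        # Mark the field dirty only when its value actually changes.
        if field_code not in self._values or self._values[field_code] != value:
            self._values[field_code] = value
            self._dirty.add(field_code)

    def take_delta(self):
        """Return the dirty fields and reset the dirty markers."""
        delta = {code: self._values[code] for code in sorted(self._dirty)}
        self._dirty.clear()
        return delta


record = Record("quote")
record.set(1, 120)
record.set(2, "EUR")
first = record.take_delta()   # both fields are new, so both would be sent

record.set(1, 121)
record.set(2, "EUR")          # unchanged, so it stays clean
second = record.take_delta()  # only field 1 would be sent
```

In this sketch the second delta carries a single field, which is the saving the protocol relies on.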

Frame Structure

A frame is composed of three segments: preamble, header and data.

Preamble

The preamble describes which record the frame is transmitting and the lengths of the header and data segments. The preamble has a dynamic section and a static section. The dynamic section is a variable-length segment that describes the record identity (name) and how many bytes compose the record name. The Record ID is always a string (encoding is covered later on).

Static section

The static section is formatted as shown below.

The header and data size identify the byte sizes of the header and data segments, respectively.

Header

The header identifies the fields that are transmitted in the frame and where the data for the field exists in the data segment. The structure is composed of repeating field specification sub-structures. There is one field specification for each field that is transmitted in the frame.

A field specification contains four segments: a field byte count (FBC), a data position byte count (DBC), the field code and the data position.

Segment	Meaning
FBC	The number of bytes to read to resolve the compacted field code.
DBC	The number of bytes to read to resolve the compacted data position.
Field code	The field code that identifies the field, specified as a compacted integer (see String wire format field specification). How the code is transmitted and registered is not defined explicitly by the protocol.
Data position	The data position is the value of the position in the data segment where the data for the field occurs. This value is specified as a compacted integer.

The concept of compacted integers is described in Compacted Integral Transmission.

String wire format field specification

There are two ‘flavours’ of field specification: integer wire format and string wire format. The difference is that an integer wire format field spec sends only an integer code to identify the field in the record, whereas a string wire format field spec sends the field name as a string. The frame below shows the structure of a string wire format field spec.

There is no indication in the field specification as to whether it is a string or integer wire format; the field specification is interpreted as either string or integer by application code. The only use of string wire format field specs in Fulmine is for transmitting the container definition; this describes the fields of a record (container).

Data

The data segment is a contiguous block of data for all the fields that are sent in the frame. Each field specification in the header includes the position in the data segment where the field data occurs.

The length of the data to read for each field is calculated by subtracting the data position of the current field specification from the data position of the next one; e.g. the length of field data 1 = (data position 2 - data position 1). For the last field spec in the header, the field data runs to the end of the data segment.
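
The length rule above can be sketched as follows. The function name `split_field_data` is hypothetical, and the sketch assumes that data positions are zero-based offsets from the start of the data segment and appear in ascending order, neither of which is stated explicitly here.

```python
def split_field_data(data_positions, data_segment):
    """Slice the data segment into one byte string per field.

    data_positions holds the data position taken from each field
    specification, in header order (assumed zero-based and ascending).
    """
    fields = []
    for index, start in enumerate(data_positions):
        if index + 1 < len(data_positions):
            # Length = next data position - current data position.
            end = data_positions[index + 1]
        else:
            # The last field runs to the end of the data segment.
            end = len(data_segment)
        fields.append(data_segment[start:end])
    return fields


data_segment = bytes([0x78, 0x01, 0x45, 0x55, 0x52])
fields = split_field_data([0, 1, 2], data_segment)
```

Here the three fields occupy 1, 1 and 3 bytes of the five-byte data segment.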

Data formats

The following table lists the data formats for the full range of supported data types. The data for the fields transmitted in a frame must be in one of these formats.

Data type	Size (bytes)	Data format
Boolean	1	0x01 for true, 0x00 for false
Signed integral up to 64 bits	1-8	Compacted integral, big-endian
Single precision, signed float	4	Single precision, IEEE 754
Double precision, signed float	8	Double precision, IEEE 754
String	Variable	UTF-8
Record (nested)	Variable	Record frame
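
As an illustration of the fixed formats in the table, the sketch below encodes a boolean, a float, a double and a string using Python's standard `struct` module. The helper names are hypothetical, and big-endian byte order for floats and doubles is an assumption; the table only states big-endian for integrals.

```python
import struct


def encode_boolean(value):
    # 0x01 for true, 0x00 for false.
    return b"\x01" if value else b"\x00"


def encode_float(value):
    # IEEE 754 single precision, 4 bytes (big-endian is an assumption).
    return struct.pack(">f", value)


def encode_double(value):
    # IEEE 754 double precision, 8 bytes (big-endian is an assumption).
    return struct.pack(">d", value)


def encode_string(value):
    # UTF-8, so the encoded size depends on the characters used.
    return value.encode("utf-8")
```

Compacted integrals and nested record frames are not covered by this sketch.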

Nested records

Nested records are transmitted as nested record frames, as specified below. The nested record wire format structure follows exactly as with a standard record but without a preamble; a nested record has no identity so no preamble is required.

The only current use of nested records is for transmitting the container definition. Records are otherwise flat structures.

Compacted Integral Transmission

The compacted integral transmission technique transmits integrals using the minimum number of bytes required by truncating all leading bytes that are zero. The integrals must be represented in big-endian byte order.

It relies on the receiver of the frame performing the data ‘inflation’ by adding the missing bytes. For this, the receiver must know the full data size of the integral. The ‘inflation’ process is a simple bitwise OR of a zero-filled value of the full data size with the compacted integral bytes.
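
A minimal sketch of compaction and inflation for non-negative integrals follows. The function names are hypothetical, the default size of 4 bytes is only an example, and sending a single 0x00 byte for the value zero is an assumption rather than something stated here.

```python
def compact(value, size=4):
    """Drop the leading zero bytes of a non-negative integral of `size` bytes."""
    full = value.to_bytes(size, byteorder="big")
    stripped = full.lstrip(b"\x00")
    # Assumption: zero is still sent as one byte rather than as no bytes.
    return stripped or b"\x00"


def inflate(compacted, size=4):
    """Rebuild the integral by OR-ing the compacted bytes into a zero value."""
    if len(compacted) > size:
        raise ValueError("compacted integral is longer than the target size")
    result = 0  # stands for `size` zero bytes
    for byte in compacted:
        result = (result << 8) | byte
    return result


wire = compact(120)       # one byte on the wire instead of four
restored = inflate(wire)  # the receiver knows the target size is 4 bytes
```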

NOTE: there is no special treatment for floats and doubles. This is because the binary fraction part will most likely have a repeating expansion, so no optimisation based on discarding bytes would work. The effort required to add this is not justified, seeing as the only non-repeating expansions occur when dividing a number by a power of 2.

The examples below illustrate the compacted integral transmission technique.

Negative Integrals

Negative integrals must be represented as two’s complement. The bytes representing the negative integral are inverted and all leading 0 bytes are truncated. The remaining bytes are re-inverted and sent. On the receiver side, a bitwise AND operation is performed on the transmitted bytes and -1.
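
The sender steps above can be sketched as shown below, together with one possible reading of the receiver side. The function names are hypothetical. How the receiver learns that a value is negative is not described in this section, so the sketch assumes the caller already knows and picks the matching function. The receiver here starts from -1 (all bits set) and shifts the transmitted bytes in, which has the same effect as padding them with leading 0xFF bytes; this is an interpretation, not a confirmed description of the Fulmine implementation.

```python
def compact_negative(value, size=4):
    """Compact a negative integral held as `size` two's complement bytes."""
    full = value.to_bytes(size, byteorder="big", signed=True)
    inverted = bytes(b ^ 0xFF for b in full)        # invert every byte
    # Truncate leading zero bytes; keeping one byte for -1 is an assumption.
    stripped = inverted.lstrip(b"\x00") or b"\x00"
    return bytes(b ^ 0xFF for b in stripped)        # re-invert before sending


def inflate_negative(compacted):
    """Restore a negative integral from its compacted bytes."""
    result = -1  # all bits set, i.e. every missing leading byte is 0xFF
    for byte in compacted:
        result = (result << 8) | byte
    return result


wire = compact_negative(-136)      # 0xFFFFFF78 in memory, one byte on the wire
restored = inflate_negative(wire)
```

Note that -136 and 120 both compact to the single byte 0x78 in this sketch, which is why the receiver needs the sign from elsewhere.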